A Hybrid Approach for Privacy-preserving Record Linkage
نویسنده
چکیده
The integration of information dispersed among multiple repositories is a crucial step for accurate data analysis in various domains. In support of this goal, it is critical to devise procedures for identifying similar records across distinct data sources. At the same time, to adhere to privacy regulations and policies, such procedures should protect the confidentiality of the individuals to whom the information corresponds. Various private record linkage (PRL) protocols have been proposed to achieve this goal, involving secure multi-party computation (SMC) and similarity preserving data transformation techniques. SMC methods provide secure and accurate solutions to the PRL problem, but are prohibitively expensive in practice for large data sets, mainly due to excessive computational requirements. Data transformation techniques offer more practical solutions, but incur the cost of information leakage and false matches. In this talk, we discuss how the performance of SMC based PRL techniques could be significantly improved by combining them with data sanitization techniques without incurring the cost of information leakage and false matches. Furthermore, we discuss how to efficiently handle typographical errors exist in data during the PRL protocol execution. BIO Dr. Murat Kantarcioglu is an Associate Professor in the Computer Science Department and Director of the UTD Data Security and Privacy Lab at the University of Texas at Dallas. He holds a B.S. in Computer Engineering from Middle East Technical University, and M.S. and Ph.D degrees in Computer Science from Purdue University. He is a recipient of NSF CAREER award and Purdue CERIAS Diamond Award for Academic excellence. Currently, he is a visiting scholar at Harvard Data Privacy Lab. Dr. Kantarcioglu's research focuses on creating technologies that can efficiently extract useful information from any data without sacrificing privacy or security. His research has been supported by grants from NSF, AFOSR, ONR, NSA, and NIH. He has published over 100 peer reviewed papers. Some of his research work has been covered by the media outlets such as Boston Globe, ABC News etc. and has received two best paper awards. (c) 2014, Copyright is with the authors. Published in the Workshop Proceedings of the EDBT/ICDT 2014 Joint Conference (March 28, 2014, Athens, Greece) on CEUR-WS.org (ISSN 1613-0073). Distribution of this paper is permitted under the terms of the Creative Commons license CC-by-nc-nd 4.0 7th International Workshop on Privacy and Anonymity in the Information Society (PAIS’14) March 28, 2014, Athens, Greece
منابع مشابه
Privacy preserving interactive record linkage (PPIRL)
OBJECTIVE Record linkage to integrate uncoordinated databases is critical in biomedical research using Big Data. Balancing privacy protection against the need for high quality record linkage requires a human-machine hybrid system to safely manage uncertainty in the ever changing streams of chaotic Big Data. METHODS In the computer science literature, private record linkage is the most publish...
متن کاملTree Based Scalable Indexing for Multi-Party Privacy-Preserving Record Linkage
Recently, the linking of multiple databases to identify common sets of records has gained increasing recognition in application areas such as banking, health, insurance, etc. Often the databases to be linked contain sensitive information, where the owners of the databases do not want to share any details with any other party due to privacy concerns. The linkage of records in different databases...
متن کاملPrivacy Preserving Group Linkage
The problem of privacy-preserving record linkage is to find the intersection of records from two parties, while not revealing any private records to each other. Recently, group linkage has been introduced to measure the similarity of groups of records [19]. When we extend the traditional privacy-preserving record linkage methods to group linkage measurement, group membership privacy becomes vul...
متن کاملPrivacy Preserving Record Linkage with PPJoin
Privacy-preserving record linkage (PPRL) becomes increasingly important to match and integrate records with sensitive data. PPRL not only has to preserve the anonymity of the persons or entities involved but should also be highly efficient and scalable to large datasets. We therefore investigate how to adapt PPJoin, one of the fastest approaches for regular record linkage, to PPRL resulting in ...
متن کاملPrivacy-preserving record linkage using Bloom filters
BACKGROUND Combining multiple databases with disjunctive or additional information on the same person is occurring increasingly throughout research. If unique identification numbers for these individuals are not available, probabilistic record linkage is used for the identification of matching record pairs. In many applications, identifiers have to be encrypted due to privacy concerns. METHOD...
متن کاملVulnerabilities in the use of similarity tables in combination with pseudonymisation to preserve data privacy in the UK Office for National Statistics' Privacy-Preserving Record Linkage
In the course of a survey of privacy-preserving record linkage, we reviewed the approach taken by the UK Office for National Statistics (ONS) as described in their series of reports “Beyond 2011”. Our review identifies a number of matters of concern. Some of the issues discovered are sufficiently severe to present a risk to privacy. The issues discovered are as follows, in order of severity, fr...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2014